ABSTRACT

Responses to American College Test College Outcome 
Measures Program (ACT-COMP) items by 481 black and 9,237 white 
students at the University of Tennessee (Knoxville) were analyzed 
using Samejima's graded model to determine the level of 
differential item functioning (DIF). Students had been tested using
Form 8 of the ACT-COMP objective test either as freshmen or as 
seniors. The test contains 60 multiple-choice items, each of which 
has two correct answers. The model developed by Samejima (1969) for 
graded responses, which uses a series of binary models to describe 
polychotomous data, was used to assess the data. Student response 
patterns were fitted to the graded model and five items that did not 
fit the model were dropped. The remaining items were analyzed using 
threshold parameters and their standard errors to calculate 
difficulty-shift coefficients. Results indicate that: (1) for 32 of
the 55 remaining items, significant instances of DIF are present; (2)
instances of DIF are not evenly distributed among the six subscales
of the ACT-COMP test; (3) questions designed to assess explanation
skills produce higher rates of DIF than do questions designed to
assess skills related to identification and description; and (4)
activities that rely on blueprints, require interpretation of satire,
or use a radio news format produce high levels of DIF. Four data
tables and nine graphs are provided. (TJH)

Surveys of current assessment practice consistently find that colleges
and universities make extensive use of student achievement data to evaluate
the quality and effectiveness of their education programs (Boyer, Ewell,
Finney, & Mingle, 1987; El-Khawas, 1988; National Governors' Association,
1988). These achievement data almost always are used to examine differences
between institutional means and national norms or differences between programs
at the same institution.

Differential item functioning (DIF) refers to a situation in which an
identifiable subgroup performs better (or worse) on a set of test questions
than do other subgroups. Such a situation represents a serious threat to the
validity of the comparisons made in assessment research because differences in
the performance of subgroups may produce variance in achievement scores that
is not related to program quality (Thissen, Steinberg, & Wainer, 1988).
Consequently, programs may be incorrectly judged to be effective or
ineffective depending on whether certain subgroups are overrepresented in the
programs.

The performance funding guidelines adopted by the Tennessee Higher
Education Commission (THEC) in 1983, and revised in 1986, provide an example
of how differential item functioning can adversely affect assessment efforts.
These guidelines currently provide a financial supplement of up to 5% of an
institution's budget for instruction, and the standard on learning in general
education determines one-fifth of this total, or approximately $1 million
(Pike & Banta, 1987). Awards in general education are based, in part, on
institutional means (national percentile ranks) on the College Outcome
Measures Program (COMP) examination (Banta, 1988). In addition, the
performance funding standard on corrective measures requires that institutions
use subscores on the COMP exam to implement program changes that will improve
total scores on the exam.

Because public institutions in Tennessee vary greatly in terms of the 
characteristics of their student populations, differences in the performance 
of subgroups on the COMP exam may significantly influence program improvement 
efforts and the money received through the performance funding guidelines. 
For example, if black students perform differently than whites on the COMP 
exam, judgments about program effectiveness and allocations of money will be 
influenced by the proportion of black students an institution tests during a
given year.

Phillippi (1989) reports that performance on the COMP exam is
significantly different for black and white students at the University of
Tennessee, Knoxville (UTK). In separate analyses for freshmen and seniors, he
finds that the mean total score on the COMP exam is 10 points lower for blacks
than for whites, even after controlling for the effects of entering
achievement levels (ACT Assessment scores) and age. Phillippi also notes that
there are significant differences in the means for black and white students on
the subscales of the COMP exam. While these results strongly suggest that
items on the COMP exam function differently for blacks and whites, they do not
indicate which items are involved, nor do they provide information about the
magnitude of the differences.

Although the analysis of covariance techniques employed by Phillippi can
be used to identify instances of differential item functioning, several
authors suggest that the techniques of item response theory (IRT) are superior
to analyses based on general linear models (Burrill, 1982; Camilli & Shepard,
1987). Accordingly, the present research uses techniques from item response
theory to evaluate differential item functioning for blacks and whites on the
COMP exam. In the context of item response theory, differential item
functioning is defined as statistically significant differences in the item
characteristic curves (ICCs) for black and white subgroups (Thissen,
Steinberg, & Wainer, 1988).



Methods 



The Students 

Analyses of the questions on the COMP exam are based on the responses of
481 black and 9,237 white students at UTK who were tested using Form 8 of
the COMP Objective Test either as freshmen or as seniors. Approximately 52%
(5,040) of the total sample consists of freshmen, with 304 (6%) of the
freshmen being black and 4,736 (94%) being white. Of the 4,678 seniors tested,
177 (4%) are black and 4,501 (96%) are white.



The COMP Exam 

In 1976, the American College Testing Program (ACT) organized the College 
Outcome Measures Program (COMP) to develop a measure of "knowledge and skills 
relevant to successful functioning in adult society" (Forrest, 1982, p. 11). 
Since its development, the COMP exam has been administered at least once on
more than 500 college campuses, and it is used annually by approximately 100
four-year institutions in the evaluation of their general education programs
(American College Testing Program, 1987).

The COMP exam is available in two forms: the Objective Test (consisting 
of multiple-choice questions) and the Composite Examination (consisting of 
multiple-choice items and exercises requiring students to write essays and 
record speeches). ACT staff report that the correlation between the two forms 
of the exam is .80, allowing the Objective Test to serve as a proxy for the 
Composite Examination (Forrest & Steele, 1982). Most institutions use the 
Objective Test because it is easier to administer and score (Banta, Lambert, 
Pike, Schmidhammer, & Schneider, 1987). 

The Objective Test contains 60 multiple-choice questions, each with two
correct answers. The questions are divided among 15 separately timed
activities drawing on material (stimuli) from television programs, radio
broadcasts, and print media. Students taking the COMP exam are instructed that
there is a penalty for guessing (i.e., incorrect answers will be subtracted
from their scores), but that leaving a question blank will not be counted
against them.

The combination of two correct answers for each item, the guessing
penalty, and no penalty for not answering a question means that the score
range for each of the 60 items is from -2 to 2 points. A score of -2
represents two incorrect answers, while a score of -1 represents one incorrect
answer and one answer left blank. A score of 0 can represent either both
answers left blank or one correct and one incorrect answer. A score of 1
represents one correct answer and a blank, and a score of 2 represents two
correct answers. For convenience, scores for each item are recoded to produce
a range from 0 to 4 points, making the maximum possible score on the Objective
Test 240 points and a chance score 120 points.
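
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the paragraph above; the function names are hypothetical and are not part of any ACT scoring software.

```python
ITEMS = 60  # number of questions on the Objective Test

def raw_item_score(n_correct: int, n_incorrect: int) -> int:
    """Raw item score: +1 per correct mark, -1 per incorrect mark, blanks count 0."""
    if n_correct < 0 or n_incorrect < 0 or n_correct + n_incorrect > 2:
        raise ValueError("each item allows at most two marked answers")
    return n_correct - n_incorrect

def recoded_item_score(n_correct: int, n_incorrect: int) -> int:
    """Shift the -2..2 raw range to the 0..4 range used in the analyses."""
    return raw_item_score(n_correct, n_incorrect) + 2

# Two blanks and one-right-one-wrong receive the same recoded score.
print(recoded_item_score(0, 0), recoded_item_score(1, 1))
# Totals at the top of the range and at the midpoint of the recoded scale.
print("maximum total:", ITEMS * recoded_item_score(2, 0))
print("midpoint total:", ITEMS * recoded_item_score(0, 0))
```

The midpoint total corresponds to the chance score of 120 points reported above, since a recoded score of 2 on every item is what a raw score of 0 becomes.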

In addition to a total score, the COMP exam provides three content
subscores (Functioning within Social Institutions, Using Science and
Technology, and Using the Arts) and three process subscores (Communicating,
Solving Problems, and Clarifying Values). Content subscores may be further
subdivided based on the 15 stimulus activities (five activities for each
content area).
For each content subscore, two of the activities require identification or 
description, and three of the activities require explanation (Forrest & 
Steele, 1982). 

Process subscores can be subdivided into 20 skills (six each for
Communicating and Clarifying Values, and eight skills for Solving Problems).
The six skill areas for the Communicating subscore evaluate the ability to
receive and send information from oral presentations, written materials, and
numerical/graphic representations. The skill areas for Solving Problems and
Clarifying Values represent the skills of identification and analysis (Forrest
& Steele, 1982). Because the six subscales of the COMP exam form a matrix
using the same test questions, activities requiring identification and
description correspond to the skills of identification, while activities
requiring explanation correspond to the skills associated with analysis.

The DIF Test

While item response theory provides a better method of detecting
instances of differential item functioning than do traditional general linear
model (GLM) procedures, the binary item response models typically used for
this purpose are not appropriate for the COMP exam with its five possible
response categories for each question. Although scores on each question could
be recoded to conform to a binary model (e.g., only giving credit for two
correct answers), recoding the questions would change the nature of the COMP
exam and sidestep the issue of whether the COMP exam, as used in performance
funding, evidences differential item functioning for black and white students.

The use of ordered scores (from 0 to 4) for each question on the COMP
exam suggests that a polychotomous item response model would be more
appropriate for analyzing this test. Samejima's (1969) model for graded
responses uses a series of binary models to describe polychotomous data. The
item response functions in the graded model represent the probability of a
correct response in a given category ($k$) and all higher categories ($k+$).
According to Thissen (1988), the probabilities associated with a particular
response function ($P_{k+}$) can be represented mathematically as:

$$P_{k+} = \frac{1}{1 + \exp[-a_{k+}(\theta - b_{k+})]}$$

where $a_{k+}$ is the slope of the function, $b_{k+}$ is the threshold of the
function, and $\theta$ is the latent ability or achievement level of the
respondent. Because the probability for the lowest response and all higher
responses is unity, $n$ response categories can be described by $n - 1$
functions.

Thissen and Steinberg (1986) describe Samejima's model for graded
responses as a difference model because the probability of a given response
($k$) is the difference between the probability for the function $k+$ and the
next highest function ($m+$):

$$P_k = P_{k+} - P_{m+}$$

Figure 1 presents graded model response functions for a hypothetical
question on the COMP exam. These functions depict two important assumptions
of graded response models. First, these models assume that responses are
ordered (i.e., that 2 is greater than 1). If this assumption is not met at
all levels of the latent ability/achievement variable, the difference formula
will yield negative probabilities. The second assumption of the graded
model is that the slopes of the functions are all equal (Thissen, 1988).
Unequal slopes produce functions that will cross at some point on the
ability/achievement continuum, and the difference formula again will yield
negative probabilities (Thissen & Steinberg, 1986).
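
The difference formula and the equal-slopes assumption can be illustrated with a short Python sketch. The thresholds and slopes below are hypothetical values chosen only for illustration (they are not estimates from the COMP data), and the function names are assumptions rather than part of MULTILOG.

```python
import math

def cumulative_prob(theta, a, b):
    """Logistic response function: probability of scoring in a category or higher."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def category_probs(theta, slopes, thresholds):
    """Category probabilities for n categories from n - 1 cumulative functions."""
    cum = [1.0]  # the lowest category "or higher" is certain
    cum += [cumulative_prob(theta, a, b) for a, b in zip(slopes, thresholds)]
    cum.append(0.0)  # nothing lies above the top category
    # Each category probability is the difference of adjacent cumulative functions.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical thresholds for a five-category (0 to 4) item.
thresholds = [-2.0, -0.5, 0.5, 2.0]

# Equal slopes: every category probability is positive and they sum to one.
equal = category_probs(0.0, [1.0] * 4, thresholds)
print([round(p, 3) for p in equal], round(sum(equal), 3))

# Unequal slopes: two curves cross, so one difference becomes negative.
crossed = category_probs(-6.0, [1.0, 0.3, 1.0, 1.0], thresholds)
print("negative probability present:", min(crossed) < 0)
```

With the slopes held equal the cumulative curves never cross, which is why fixing the slopes keeps every differenced probability nonnegative.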



Insert Figure 1 about here 



The twin assumptions of ordered responses and equal slopes parallel the
assumptions of one-parameter binary item response models. In one-parameter
models, slopes for the items are assumed to be equal (usually 1.00) and only
the thresholds (item difficulty levels) vary. Because of the wide variety of
tests for differential item functioning that are available for one-parameter
models, treating the response functions of the graded model as a series of
one-parameter models is particularly helpful (Ironson, 1982; Thissen,
Steinberg, & Wainer, 1988).

Among the most popular tests for differential item functioning is the
difficulty-shift statistic (Lord, 1977). This test makes use of a
z statistic (i.e., a value for the standard normal distribution) and
calculates differences in difficulty values after equating parameters onto the
same latent ability/achievement scale (Ironson, 1982). The difficulty-shift
statistic is defined as:

$$z = \frac{b_1 - b_2}{(SE_1^2 + SE_2^2)^{1/2}}$$

where $b_1$ and $b_2$ are threshold (difficulty) parameters, and $SE_1$ and
$SE_2$ are the standard errors for the difficulty parameters.

Application of the difficulty-shift statistic to graded models provides
an empirical test of the assumption that thresholds (difficulty levels) are
the same across subgroups. Nonsignificant results indicate that difficulty
levels are similar across subgroups, while significant results indicate that
the difficulty of achieving a given score is different for the subgroups. In
terms of the present research, differential item functioning is operationally
defined as statistically significant differences in the threshold (difficulty)
parameters for the response functions of blacks and whites on the 60 items of
the COMP exam.
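
As a check on the arithmetic, the difficulty-shift statistic can be computed directly from a pair of threshold estimates and their standard errors. The sketch below is illustrative only: the function name is hypothetical, and the input values are the four rows reported for item 60 in Table 1, taken in the column order in which they are printed there.

```python
import math

def difficulty_shift(b1, se1, b2, se2):
    """z = (b1 - b2) / sqrt(se1^2 + se2^2) for one pair of threshold estimates."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

# (b, SE, b, SE) for the four response functions of item 60 in Table 1.
item_60_rows = [
    (-4.364, 0.079, -4.319, 0.349),
    (-3.526, 0.055, -3.522, 0.247),
    (-0.498, 0.026, -0.390, 0.111),
    (0.102, 0.026, 0.217, 0.111),
]

z_values = [difficulty_shift(*row) for row in item_60_rows]
for z in z_values:
    print(round(z, 3))
```

Rounded to three decimals, these values agree with the z scores printed for item 60 in Table 1 (-0.126, -0.016, -0.947, and -1.009).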



The Data Analyses 

Analyzing students' test responses was a two-step process. First, re- 
sponse patterns for each item were fitted to the graded model using the 
MULTILOG computer program (Thissen, 1988). The responses of blacks and whites 
were analyzed separately, and the slopes of the response functions were fixed 
at 1.00. Five questions did not fit the model and were dropped from further 
analyses. Of these five questions, three involved the responses of black 
students and two involved the responses of white students. It is important to
note that fixing the slopes at 1.00 was not the cause of misfitting models.
For all five questions, the data did not represent ordered responses at any
slope.

For the 55 questions which did conform to the assumptions of a graded
model, the second step in the data analysis involved using threshold
parameters and their standard errors to calculate difficulty-shift
coefficients. Because of the large number of comparisons being made (220: four
threshold parameters for each of 55 questions), a conservative probability
level (p < .0001) was used. The selection of this probability level for
individual comparisons resulted in an overall probability level of
p < .05 for all comparisons.
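
The relation between the per-comparison level and the overall level can be checked with a brief calculation. The sketch below uses a Bonferroni-style upper bound and assumes two-tailed tests; neither the choice of bound nor the tailedness is stated in the text, so both should be read as assumptions.

```python
from statistics import NormalDist

n_comparisons = 55 * 4      # four threshold parameters for each of 55 questions
alpha_each = 0.0001         # per-comparison probability level

# Bonferroni-style upper bound on the overall (familywise) error rate.
bonferroni_bound = n_comparisons * alpha_each

# Critical |z| for a single comparison, assuming a two-tailed test.
critical_z = NormalDist().inv_cdf(1 - alpha_each / 2)

print(f"upper bound on overall level: {bonferroni_bound:.3f}")
print(f"two-tailed critical z: {critical_z:.2f}")
```

Under these assumptions the overall level is bounded well below .05, which is consistent with the statement above.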

Interpretation of difficulty-shift results also was a multi-step process. 
First, results for all questions were examined and the predominant patterns of 
differential item functioning were identified. Second, the subscore matrix
for the COMP exam was used to identify subscores with particularly
pronounced rates of differential item functioning. Finally, the divisions of
subscores identified previously were used to identify particular types of 
questions with consistently high rates of differential item functioning. 
Because of the overlap in these divisions, analyses were restricted to the 
identification and explanation skills of the content subscores and the oral, 
written, and math skills related to the Communicating subscore. 



Results 



Patterns of DIF

Results of the difficulty-shift analyses indicate that a substantial
number of the questions on the COMP exam function differently for blacks and
whites. Table 1 presents the threshold (difficulty) parameters and their
standard errors for the scores of black and white students on the 60 items of
the COMP exam. Asterisks (*) are used to indicate those items for which it is
impossible to calculate threshold values. In addition, difficulty-shift (z)
scores are presented for each response function. Positive z scores identify
those response functions that favor blacks, and negative z scores identify
the functions that favor whites. Asterisks adjacent to the difficulty-shift
coefficients indicate which response functions have significantly different
threshold parameters for blacks and whites.



Insert Table 1 about here 



As previously noted, difficulty-shift coefficients could not be computed
for five of the items on the COMP exam (3, 28, 29, 42, 58) because the data
for these items do not conform to the assumptions of a graded model. Of the
55 questions analyzed, 32 significantly favor whites and none significantly
favor blacks. Examination of these 32 questions reveals that 11 of the
questions have substantial levels of difficulty-shift (significant differences
for three or four threshold parameters) and 21 of the questions have moderate
levels of difficulty-shift (significant differences for one or two threshold
parameters).
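
The substantial/moderate distinction is a simple counting rule, sketched below. The cutoff of 3.89 is an assumed two-tailed critical value for p < .0001 and the function name is hypothetical; the z scores are the ones printed for items 57 and 60 in Table 1.

```python
CRITICAL_Z = 3.89  # assumed two-tailed cutoff corresponding to p < .0001

def classify_item(z_values, critical_z=CRITICAL_Z):
    """Label an item by how many of its four thresholds differ significantly."""
    n_significant = sum(abs(z) > critical_z for z in z_values)
    if n_significant >= 3:
        return "substantial"
    if n_significant >= 1:
        return "moderate"
    return "none"

# z scores for the four response functions of two items, as printed in Table 1.
item_57 = [-0.357, -1.792, -8.955, -10.527]
item_60 = [-0.126, -0.016, -0.947, -1.009]

print("item 57:", classify_item(item_57))
print("item 60:", classify_item(item_60))
```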

Figure 2 presents graphs of the four response functions for blacks and
whites on a COMP question with a substantial level of difficulty-shift
(question 18). Each of the four graphs contrasts the response functions for
blacks and whites on this question. The item depicted in Figure 2 uses the
floor plan of a house as its stimulus and asks students to calculate building
and energy costs for the house. Basic computational skills (multiplication and
division) are required to answer this question.



Insert Figure 2 about here 



An examination of the response functions depicted in Figure 2 clearly 
shows that the functions for blacks are shifted to the right. This shift 
indicates that question 18 is significantly more difficult for blacks than 
whites (i.e., black students with the same ability/achievement levels as white 
students are more likely to make lower scores on this question).

Figure 3 presents the response functions for a COMP item (question 55)
with moderate levels of difficulty-shift. Again, each of the four graphs
contrasts probabilities for a given score and all higher scores for whites
with similar probabilities for blacks. An examination of the graphs in Figure
3 reveals that the response functions of blacks and whites are virtually
identical for scores of one or greater and scores of two or greater. However,
blacks and whites differ significantly for the response functions representing
scores of three or more and scores of four.



Insert Figure 3 about here



It should be emphasized that the pattern identified in Figure 3 is
replicated for all 18 instances of moderate difficulty-shift in which two
response functions differ significantly. For the three questions for which
only one response function is significantly different, that function always
represents a score of four points.



COMP Subscores 

A more detailed examination of the questions evidencing significant
shifts in difficulty reveals that these questions are not evenly distributed
across subscores. Table 2 presents the number and percentage of items for
each content and process subscale with significant difficulty-shift
coefficients. This table also presents the same data broken down by the nine
cells of the content-by-process subscore matrix.



Insert Table 2 about here 



An examination of the data in Table 2 indicates that the Functioning
within Social Institutions (FSI) content subscale has a relatively low number
of items with significant difficulty-shift values. Only six (30%) of the items
comprising this subscale produce significant shifts in threshold parameters,
and none of these shifts occur for more than two threshold parameters.
Interestingly, four of the five questions that did not meet the assumptions of
a graded model are contained in this subscale.

In contrast, the Using Science and Technology (US) subscale has a large
number of items with significant difficulty-shift coefficients. Sixteen (80%)
of the questions contained in this subscale produce significant
difficulty-shift results, and six of the questions evidence shifts in three or
more threshold parameters.

Rates of difficulty-shift for the Using the Arts (UA) subscale are more 
moderate. Ten (50%) of the items in this subscale produce significant re- 
sults. For five of these ten items, significant difficulty-shift coefficients 
are present for three or more of the response functions. 

Concerning the process subscales, both Communicating (COM) and Solving
Problems (SP) have a relatively large number of questions that produce
significant differences in threshold parameters for blacks and whites. Twelve
(67%) of the Communicating questions and thirteen (54%) of the Solving
Problems questions contain significant difficulty-shift coefficients.
Furthermore, five of the questions comprising the Solving Problems subscale
and two of the questions comprising the Communicating subscale evidence
significant shifts in three or more threshold parameters.

The Clarifying Values (CV) subscale has the fewest instances of
significant difficulty-shift results of any process subscale. Only seven (39%)
of these questions produce significant difficulty-shift coefficients. However,
four of these seven questions do evidence significant results for three or
more response functions.

As one may surmise from the results for the individual content and
process subscales, the incidence of shifts in difficulty levels is not evenly
distributed over the nine cells of the content-by-process subscore matrix.
All six items contained in the Using Science and Communicating cell evidence
significant differences in the threshold parameters of blacks and whites.
Similarly, six of the eight questions related to Using Science and Solving
Problems produce significant difficulty-shift results, as do four of the six
Using Science and Clarifying Values questions.

Four of the six Using the Arts and Communicating questions also show 
significant shifts in threshold parameters, as do five of the eight Using the 
Arts and Solving Problems questions. Only one Using the Arts and Clarifying
Values question has a statistically significant difficulty-shift coefficient,
and difficulty-shift rates are stable across the three Functioning within
Social Institutions cells.



COMP Activities 

The table of specifications for the 15 COMP activities provides
additional information on the types of questions evidencing significant shifts
in threshold parameters for blacks and whites. Table 3 presents the number and
percentage of items with significant difficulty-shift coefficients broken down
by content area and whether the activities for that content area require
identification/description or explanation.



Insert Table 3 about here 



Overall, activities requiring explanation are almost twice as likely to
produce significant difficulty-shift results as are activities requiring
identification/description (62% versus 33%). This tendency is most pronounced
for the Functioning within Social Institutions content area, where all six
instances of difficulty-shift occur in activities requiring explanation.
Similarly, 9 of the 10 instances of difficulty-shift in the area of Using the
Arts occur in activities requiring explanation. For the Using Science and
Technology area, rates of difficulty-shift are extremely high both for
activities requiring identification/description (83%) and for activities
requiring explanation (79%).
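
The percentages quoted above can be reproduced from the counts in Table 3. In the sketch below the counts of significant items come from Table 3, while the cell sizes (6 identification/description items and 14 explanation items per content area) are not printed in the table; they are inferred from the reported percentages and should be treated as an assumption.

```python
# (items with significant difficulty-shift, items in the cell)
# Cell sizes of 6 and 14 are inferred from the Table 3 percentages.
table3 = {
    "FSI": {"identification": (0, 6), "explanation": (6, 14)},
    "US": {"identification": (5, 6), "explanation": (11, 14)},
    "UA": {"identification": (1, 6), "explanation": (9, 14)},
}

def rate(skill):
    """Return (flagged items, total items, percentage) pooled over content areas."""
    flagged = sum(cells[skill][0] for cells in table3.values())
    total = sum(cells[skill][1] for cells in table3.values())
    return flagged, total, 100 * flagged / total

for skill in ("identification", "explanation"):
    flagged, total, pct = rate(skill)
    print(f"{skill}: {flagged}/{total} = {pct:.0f}%")

ratio = rate("explanation")[2] / rate("identification")[2]
print(f"explanation rate is {ratio:.2f} times the identification rate")
```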

Results for the identification and analysis skills required in the
Solving Problems and Clarifying Values areas are identical to the results
presented above because identification in the process areas corresponds to
identification/description in the content areas, and analysis corresponds to
explanation. For the Communicating subscale, five (83%) of the six questions
designed to evaluate students' abilities in sending and receiving numeric and
graphic information produce significant difficulty-shift results.

An examination of the types of stimuli students respond to in the various 
COMP activities also provides information about differences in the ways items 
function for blacks and whites. Table 4 presents the number and percentage of
questions producing significant difficulty-shift coefficients broken down by
each activity. In addition, descriptions of the stimulus materials used in 
these activities are included in the table. 



Insert Table 4 about here 



As would be expected from the high rates of difficulty-shift for the
Using Science and Technology subscale generally, all five activities related
to the Using Science area produce rates of difficulty-shift of 50% or more.
Particularly high rates of difficulty-shift are observed for Activity 2 (a
television program on plant genetics), Activity 5 (a blueprint of an
energy-efficient home), and Activity 11 (a radio news broadcast on the
Strategic Defense Initiative).

Several other activities in the areas of Functioning within Social
Institutions and Using the Arts also produce very high rates of
difficulty-shift. These activities include the blueprint of a church (Activity
6) designed to measure skills related to Using the Arts, a satirical article
on United States foreign policy (Activity 9) also designed to measure skills
related to Using the Arts, and a radio news broadcast on marriage (Activity
10) designed to measure skills related to Functioning within Social
Institutions.



Discussion 

The principal findings of the present research can be summarized as 
follows: 

1. For 32 (58%) of the 55 questions on the COMP exam that
were evaluated in this research, significant instances of
differential item functioning (difficulty-shift) are
present. For all of these questions, differences in
threshold (difficulty) parameters favor white students,
indicating that these COMP items tend to be more
difficult for black students than for white students.

2. Instances of differential item functioning are not evenly
distributed among the six subscales of the COMP exam. For
the content subscales, Using Science and Technology
questions have a very high rate of difficulty-shift, Using the
Arts questions have a moderate rate of difficulty-shift,
and Functioning within Social Institutions questions have a
relatively low rate of difficulty-shift. For the process
subscores, both Communicating and Solving Problems
questions have moderate to high rates of difficulty-shift,
while Clarifying Values questions have a relatively low rate
of difficulty-shift.

3. When questions are categorized on the basis of the content
skills they assess (i.e., identification versus
explanation), the results of this research clearly show that
questions designed to assess explanation skills produce
much higher rates of differential item functioning than do
questions designed to assess skills related to
identification and description. In addition, questions designed to
assess mathematics skills produce very high rates of
difficulty-shift.

4. Setting aside the extremely high rates of differential
item functioning (difficulty-shift) for the Using Science
and Technology subscale, examination of the nature of the
stimulus materials used in the 15 COMP activities shows
that activities which rely on blueprints, require the
interpretation of satire, or use a radio news format
produce extremely high levels of difficulty-shift.

In reviewing these results it is important to note that differences in
the functioning of COMP items do not automatically lead to the conclusion that
the COMP exam is biased, in a legal sense, against blacks. To be sure, these
results may be the product of bias in test construction; however, they also
may be the product of differences in the educational experiences of blacks and
whites. To distinguish between these explanations, studies of black and white
students with similar educational profiles need to be conducted.

The results of the present research also are limited in their
generalizability. The fact that the data are from one institution, coupled
with the relatively small number of black students in the sample, makes
generalizations beyond UTK impossible. However, these results are sufficiently
compelling to warrant extensive research across all public colleges and
universities in Tennessee.

Despite these limitations, the results of the present research clearly 
indicate that items on the COMP exam function differently for black and white 
students at the University of Tennessee, Knoxville. For whatever reason, a 
majority of the items on this test are more difficult for blacks than whites. 
Given black and white students of equal ability/achievement, the black student
will not perform as well as the white student. Stated differently, black 
students must have higher levels of ability/achievement than white students to 
make the same scores as whites on the COMP exam. 

If these results can be generalized to other institutions in Tennessee,
historically black institutions and other colleges with large black
enrollments are at a competitive disadvantage with regard to performance
funding. Specifically, the dollars awarded under Standard III of the
performance funding guidelines will be lower for these institutions than if
they had large white student populations. Even if the results of the present
research cannot be generalized beyond UTK, they indicate that efforts to
increase the size of black enrollment (and black retention) at UTK are
inherently in conflict with efforts to improve scores on the COMP exam.

Item            b         SE            b         SE             z

                                    0.176      0.116      -12.821 *
           -1.252      0.033        0.525      0.113      -15.095 *

(56)       -4.854      0.100       -4.430      0.352       -1.159
           -4.468      0.084       -4.042      0.297       -1.380
           -0.078      0.025        0.176      0.116       -2.141
            0.213      0.025        0.525      0.113       -2.696

(57)       -7.235      0.461       -6.731      1.334       -0.357
           -7.121      0.451       -5.585      0.729       -1.792
      [illegible]      0.069       -2.143      0.158       -8.955
      [illegible]      0.054       -1.505      0.134      -10.527

(58)       -5.266      0.123            *          *            *
           -4.961      0.106            *          *            *
      [illegible]      0.025            *          *            *
      [illegible]      0.025            *          *            *

(59)       -5.095      0.112  [illegible]      0.424       -2.555
           -4.157      0.073       -3.576      0.247       -2.256
           -1.078      0.027       -0.805      0.117       -2.274
           -0.730      0.026       -0.298      0.112       -3.757

(60)       -4.364      0.079       -4.319      0.349       -0.126
           -3.526      0.055       -3.522      0.247       -0.016
           -0.498      0.026       -0.390      0.111       -0.947
            0.102      0.026        0.217      0.111       -1.009
Table 2

Rates of Differential Item Functioning Given Subscales on the COMP Exam

                        CONTENT SUBSCALES
Process
Subscales       FSI        US         UA         TOTAL

COM             2          6          4          12
                33%        100%       67%        67%

SP              2          6          5          13
                25%        75%        62%        54%

CV              2          4          1          7
                33%        67%        17%        39%

TOTAL           6          16         10
                30%        80%        50%



Table 3

Rates of Difficulty-Shift Given the Skills Required for Content Activities

                             SKILL
Content
Subscore        Identification      Explanation

FSI             0                   6
                0%                  43%

US              5                   11
                83%                 79%

UA              1                   9
                17%                 64%

TOTAL           6                   26
                33%                 62%
Table 4

Rates of Difficulty-Shift Given the Type of Stimulus Material Used in
Content Activities

                                                  DIFFICULTY-SHIFT

Activity    Description                           Number    Percentage

 1          Television Film - FSI                 1         33%
 2          Television Film - US                  3         100%
 3          Television Film - UA                  1         33%
 4          Print Article - FSI                   3         50%
 5          Print Blueprint - US                  6         100%
 6          Print Blueprint - UA                  5         83%
 7          Print Letter - FSI                    0         0%
 8          Print Advertisement - US              2         50%
 9          Print Satirical Article - UA          3         75%
10          Radio News Show - FSI                 3         75%
11          Radio News Show - US                  3         75%
12          Radio Music Show - UA                 1         25%
13          Printed Scenario - FSI                0         0%
14          Printed Scenario - US                 2         67%
15          Printed Scenario and Slide - UA       0         0%
Figure Captions

Figure 1. Graded Response Functions for a Hypothetical COMP Item.
Figure 2. Response Functions for Question 18.
Figure 3. Response Functions for Question 55.

[Figure 1: Graded response functions for a hypothetical COMP item. Vertical
axis: probability; horizontal axis: theta; one curve for each of response
functions 1 through 4.]

[Figure 2: Response functions for Question 18, in four panels (scores of 1 or
greater, 2 or greater, 3 or greater, and 4 or greater). Vertical axis:
probability; horizontal axis: theta.]

[Figure 3: Response functions for Question 55, in four panels (scores of 1 or
greater, 2 or greater, 3 or greater, and 4 or greater). Vertical axis:
probability; horizontal axis: theta; separate curves for whites and blacks.]